Ford GoBike System Data

by Amr Saeed

This data set contains information about individual rides made in a bike-sharing system covering the greater San Francisco Bay Area. It consists of 183,412 trips, and the data snapshot was taken in 2019.

Investigation Overview

The goal of this analysis is to show how the different elements of the available data relate to one another, in particular how trip duration varies with the start location of the trip and the gender of the rider.

Dataset Overview

The data has 16 features and 183,412 trips. The features are: duration_sec, start_time, end_time, start_station_id, start_station_name, start_station_latitude, start_station_longitude, end_station_id, end_station_name, end_station_latitude, end_station_longitude, bike_id, user_type, member_birth_year, member_gender, bike_share_for_all_trip.

Before the analysis, I applied the following cleaning steps:

1- dropping null values

2- converting the duration in seconds to a duration in minutes

3- dropping the [other] gender type for simplicity

4- converting the [member_birth_year] feature to [member_age]
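The four cleaning steps above could be sketched in pandas roughly as follows. This is a minimal illustration, not the notebook's actual code: the use of pandas, the helper name `clean_trips`, the column name `duration_min`, the label `"Other"`, and the reference year 2019 for computing age are all assumptions, and the sample rows are made up.

```python
import pandas as pd


def clean_trips(df: pd.DataFrame, reference_year: int = 2019) -> pd.DataFrame:
    """Hypothetical helper applying the four cleaning steps to a trips table."""
    # 1- drop rows that contain null values
    out = df.dropna().copy()
    # 2- convert the duration from seconds to minutes (new column name is assumed)
    out["duration_min"] = out["duration_sec"] / 60
    # 3- drop the "Other" gender (the exact label is an assumption)
    out = out[out["member_gender"] != "Other"].copy()
    # 4- derive the age from the birth year, relative to an assumed reference year
    out["member_age"] = reference_year - out["member_birth_year"].astype(int)
    return out.reset_index(drop=True)


# Tiny made-up sample, for illustration only (not real trips).
sample = pd.DataFrame({
    "duration_sec": [600, 1200, 300, 900],
    "member_gender": ["Male", "Female", "Other", None],
    "member_birth_year": [1990.0, 1985.0, 1992.0, 1980.0],
})
cleaned = clean_trips(sample)
print(cleaned)
```

On the made-up sample, the row with a missing gender and the row labelled "Other" are dropped, leaving two trips with their duration in minutes and the rider's age.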


Trip Duration in Minutes

The first plot is not clear enough to show how most trip durations are distributed, so I restricted the duration to the range up to 100 minutes for better clarity.

Conclusion

Most trips last between 5 and 15 minutes, which is where the distribution is centered, as shown above.
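The 100-minute cutoff used for the duration plot can be illustrated with a small sketch. The durations below are made up, and the 5-minute bin width and the variable names are assumptions chosen for the example rather than values taken from the notebook.

```python
import numpy as np
import pandas as pd

# Made-up trip durations in minutes, for illustration only.
duration_min = pd.Series([4.0, 6.0, 7.5, 9.0, 12.0, 14.0, 22.0, 48.0, 350.0, 1400.0])

# Keep only trips shorter than 100 minutes so the long tail does not flatten the plot.
short_trips = duration_min[duration_min < 100]

# Count trips in 5-minute bins over the range 0 to 100 minutes (bin width is assumed).
bin_edges = np.arange(0, 105, 5)
counts, _ = np.histogram(short_trips, bins=bin_edges)

# Left edge of the bin that holds the most trips.
busiest_bin_start = int(bin_edges[counts.argmax()])
print(busiest_bin_start)
```

The same `short_trips` and `bin_edges` could be passed to matplotlib's `plt.hist(short_trips, bins=bin_edges)` to draw the restricted histogram.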

Plotting the top 60 trip start stations

The top start stations are ("market St") and ("san francisco Station 2").

I conclude that these two stations serve the most riders, so more bikes should be provided there.
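Ranking start stations by number of trips could be done along these lines. This is only a sketch assuming pandas: the station names are made up, and `top_n` is set to 2 for the tiny sample where the notebook uses 60.

```python
import pandas as pd

# Made-up station names, for illustration only.
trips = pd.DataFrame({
    "start_station_name": ["A St", "B St", "A St", "C St", "A St", "B St"],
})

top_n = 2  # the analysis above plots the top 60

# value_counts() sorts stations by trip count in descending order.
top_stations = trips["start_station_name"].value_counts().head(top_n)
print(top_stations)
```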

Plotting the start hour of the trip

The trip distribution over the hours of the day peaks around two timeframes, 7am-9am and 4pm-6pm, which are the typical rush hours.
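The hourly distribution can be derived from the `start_time` feature. The sketch below assumes pandas and uses made-up timestamps; the variable names are illustrative.

```python
import pandas as pd

# Made-up start times, for illustration only.
trips = pd.DataFrame({
    "start_time": [
        "2019-02-01 08:15:00",
        "2019-02-01 08:40:00",
        "2019-02-01 17:05:00",
        "2019-02-02 13:30:00",
    ],
})

# Parse the timestamps and extract the hour of day (0-23).
start_hour = pd.to_datetime(trips["start_time"]).dt.hour

# Number of trips starting in each hour, ordered by hour.
trips_per_hour = start_hour.value_counts().sort_index()
print(trips_per_hour)
```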

Trip duration according to each gender

Based on the univariate results, active users are concentrated between 20 and 45 years old.

I concluded that:

> 1- the weekdays are the working and college days
> 2- the busiest hours of the day are the start and end of working hours
> 3- workers and college students are the top clients
[Plot: the trip duration according to rider gender]
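A comparison of trip duration across genders could be summarized as follows. This is a sketch under assumptions: it uses pandas, the `duration_min` column name is hypothetical, and the values are made up.

```python
import pandas as pd

# Made-up trips, for illustration only.
trips = pd.DataFrame({
    "member_gender": ["Male", "Male", "Male", "Female", "Female"],
    "duration_min": [8.0, 10.0, 12.0, 11.0, 15.0],
})

# Number of trips plus mean and median duration for each gender.
by_gender = trips.groupby("member_gender")["duration_min"].agg(["count", "mean", "median"])
print(by_gender)
```

Reporting the count next to the mean keeps the two questions separate: which group rides more often, and which group rides longer.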


Conclusion

> The previous plot shows that male users take more rides than female and other users, while the latter take rides of longer duration. The other types of customers take long rides when they are older (50 to 60).

> The multivariate exploration confirmed the previous explorations and figures.

> The rides are mainly concentrated in rush hours, Monday through Friday, which indicates that workers and college students are the top clients.

> The longest rides are on weekends, which I attribute to congestion.

> The number of male users is higher, but the percentage in trip duration is higher for women.

> The most crowded stations at rush hours differ from the stations that are crowded during the rest of the day.

> I conclude that people avoid crowds at peak times unless the trip is for work or study.
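The weekday versus weekend comparison behind these conclusions could be checked with a sketch like the one below. It assumes pandas, a hypothetical `duration_min` column, and made-up trips. The `day_type` column is introduced only for this example.

```python
import pandas as pd

# Made-up trips, for illustration only: two on weekdays, two on a weekend.
trips = pd.DataFrame({
    "start_time": pd.to_datetime([
        "2019-02-04 08:10:00",  # Monday
        "2019-02-05 17:20:00",  # Tuesday
        "2019-02-09 11:00:00",  # Saturday
        "2019-02-10 14:30:00",  # Sunday
    ]),
    "duration_min": [9.0, 11.0, 25.0, 35.0],
})

# dayofweek runs from Monday=0 to Sunday=6, so 5 and 6 are the weekend.
is_weekend = trips["start_time"].dt.dayofweek >= 5
trips["day_type"] = is_weekend.map({True: "weekend", False: "weekday"})

# Trip count and mean duration for weekdays and weekends.
summary = trips.groupby("day_type")["duration_min"].agg(["count", "mean"])
print(summary)
```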